
22 research outputs found

plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters

Author: Blin Kai
Kautsar Satria A.
Medema Marnix H.
Osbourn Anne
Suarez Duran Hernando G.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: Biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied on 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery.

Wageningen University & Research Publications

Online Research Database In Technology

MIBiG 2.0: a repository for biosynthetic gene clusters of known function

Author: Blin Kai
Charkoudian Louise K.
Kautsar Satria A.
Navarro-Munoz Jorge C.
Shaw Simon
Publication venue: Haverford Scholarship
Publication date: 01/01/2020

Haverford College: Haverford Scholarship

antiSMASH 4.0—improvements in chemistry prediction and gene cluster boundary identification

Author: Blin Kai
Breitling Rainer
Chevrette Marc G.
Dickschat Jeroen S.
Emmanuel de los Santos L. C.
Kautsar Satria A.
Kim Hyun Uk
Lee Sang Yup
Lu Xiaowen
Medema Marnix H.
Mitchell Douglas A.
Nave Mariana
Schwalen Christopher J.
Shelest Ekaterina
Suarez Duran Hernando G.
Takano Eriko
Weber Tilmann
Wolf Thomas
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Many antibiotics, chemotherapeutics, crop protection agents and food preservatives originate from molecules produced by bacteria, fungi or plants. In recent years, genome mining methodologies have been widely adopted to identify and characterize the biosynthetic gene clusters encoding the production of such compounds. Since 2011, the ‘antibiotics and secondary metabolite analysis shell—antiSMASH’ has assisted researchers in efficiently performing this, both as a web server and a standalone tool. Here, we present the thoroughly updated antiSMASH version 4, which adds several novel features, including prediction of gene cluster boundaries using the ClusterFinder method or the newly integrated CASSIS algorithm, improved substrate specificity prediction for non-ribosomal peptide synthetase adenylation domains based on the new SANDPUMA algorithm, improved predictions for terpene and ribosomally synthesized and post-translationally modified peptides cluster products, reporting of sequence similarity to proteins encoded in experimentally characterized gene clusters on a per-protein basis and a domain-level alignment tool for comparative analysis of trans-AT polyketide synthase assembly line architectures. Additionally, several usability features have been updated and improved. Together, these improvements make antiSMASH up-to-date with the latest developments in natural product research and will further facilitate computational genome mining for the discovery of novel bioactive molecules.

Wageningen University & Research Publications

Portsmouth University Research Portal (Pure)

Warwick Research Archives Portal Repository

The University of Manchester - Institutional Repository

Online Research Database In Technology

University of Queensland eSpace

Biosynthetic potential of the global ocean microbiome

Author: Acinas Silvia G.
Bhushan Agneya
Bork Peer
Bowler Chris
Carlström Charlotte I.
Carroll Laura M.
Clayssen Quentin
Cronin Dylan R.
Delmont Tom O.
Forneris Clarissa C.
Gasol Josep M.
Gehrig Daniel
Gossert Alvar D.
Hubrich Florian
Kahles André
Karasikov Mikhail
Kautsar Satria
Larralde Martin
Lotti Alessandro
Milanese Alessio
Mustafa Harun
Paoli Lucas
Papadopoulou Chrysa
Piel Jörn
Robinson Serina L.
Ruscheweyh Hans-Joachim
Salazar Guillem
Sullivan Matthew B.
Sunagawa Shinichi
Sánchez Pablo
Wincker Patrick
Zayed Ahmed A.
Zeller Georg
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022

8 pages, 4 figures, supplementary information https://doi.org/10.1038/s41586-022-04862-3. This Article is contribution number 130 of Tara Oceans.

Natural microbial communities are phylogenetically and metabolically diverse. In addition to underexplored organismal groups [1], this diversity encompasses a rich discovery potential for ecologically and biotechnologically relevant enzymes and biochemical compounds [2,3]. However, studying this diversity to identify genomic pathways for the synthesis of such compounds [4] and assigning them to their respective hosts remains challenging. The biosynthetic potential of microorganisms in the open ocean remains largely uncharted owing to limitations in the analysis of genome-resolved data at the global scale. Here we investigated the diversity and novelty of biosynthetic gene clusters in the ocean by integrating around 10,000 microbial genomes from cultivated and single cells with more than 25,000 newly reconstructed draft genomes from more than 1,000 seawater samples. These efforts revealed approximately 40,000 putative mostly new biosynthetic gene clusters, several of which were found in previously unsuspected phylogenetic groups. Among these groups, we identified a lineage rich in biosynthetic gene clusters (‘Candidatus Eudoremicrobiaceae’) that belongs to an uncultivated bacterial phylum and includes some of the most biosynthetically diverse microorganisms in this environment. From these, we characterized the phospeptin and pythonamide pathways, revealing cases of unusual bioactive compound structure and enzymology, respectively. Together, this research demonstrates how microbiomics-driven strategies can enable the investigation of previously undescribed enzymes and natural products in underexplored microbial groups and environments.

This work was supported by funding from the ETH and the Helmut Horten Foundation; the Swiss National Science Foundation (SNSF) through project grants 205321_184955 to S.S., 205320_185077 to J.P. and the NCCR Microbiomes (51NF40_180575) to S.S.; by the Gordon and Betty Moore Foundation (https://doi.org/10.37807/GBMF9204) and the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 101000392 (MARBLES) to J.P.; by an ETH research grant ETH-21 18-2 to J.P.; and by the Peter and Traudl Engelhorn Foundation and by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement no. 897571 to C.C.F. S.L.R. was supported by an ETH Zurich postdoctoral fellowship 20-1 FEL-07. M.L., L.M.C. and G.Z. were supported by EMBL Core Funding and the German Research Foundation (DFG, Deutsche Forschungsgemeinschaft, project no. 395357507, SFB 1371 to G.Z.). M.B.S. was supported by the NSF grant OCE#1829831. C.B. was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement Diatomic, no. 835067). S.G.A. was supported by the Spanish Ministry of Economy and Competitiveness (PID2020-116489RB-I00). M.K. and H.M. were funded by the SNSF grant 407540_167331 as part of the Swiss National Research Programme 75 ‘Big Data’. M.K., H.M. and A.K. are also partially funded by ETH core funding (to G. Rätsch).

With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S).

Peer reviewed

Repository for Publications and Research Data

HAL Evry

Hal - Université Grenoble Alpes

MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Author: Aguilar César
Al-Salihi Suhad A.A.
Alanjary Mohammad
Aleti Gajender
Augustijn Hannah E.
Avalon Nicole E.
Avelar-Rivas J. Abraham
Avitia-Domínguez Luis A.
Balaya Rex Devasahayam Arokia
Barona-Gómez Francisco
Bernaldo-Agüero Jordan
Bielinski Vincent A.
Biermann Friederike
Blin Kai
Booth Thomas J.
Carrion Bravo Victor J.
Castelo-Branco Raquel
Chagas Fernanda O.
Chevrette Marc G.
Collemare Jérôme
Cruz-Morales Pablo
Du Chao
Duncan Katherine R.
Egbert Susan
Gavriilidou Athina
Gayrard Damien
Gutiérrez-García Karina
Haslinger Kristina
Helfrich Eric J.N.
Jati Afif P.
Kalkreuter Edward
Kalyvas Nikolaos
Kang Kyo B.
Kautsar Satria
Kim Wonyong
Kunjapur Aditya M.
Lee Sanghoon
Li Yong-Xin
Lin Geng-Min
Linington Roger G.
Loureiro Catarina
Louwen Joris J.R.
Louwen Nico L.L.
Lund George
Medema Marnix H.
Meijer David
Navarro-Muñoz Jorge C.
Parra Jonathan
Philmus Benjamin
Pourmohsenin Bita
Pronk Lotte J.U.
Recchia Michael J.J.
Rego Adriana
Reitz Zachary L.
Robinson Serina
Rosas-Becerra L. Rodrigo
Roxborough Eve T.
Schorn Michelle A.
Scobie Darren J.
Selem-Mojica Nelly
Singh Kumar Saurabh
Sokolova Nika
Tang Xiaoyu
Terlouw Barbara R.
Tørring Thomas
Udwary Daniel
van der Hooft Justin J.J.
van Santen Jeffrey A.
Vigneshwari Aruna
Vind Kristiina
Vromans Sophie P.J.M.
Waschulin Valentin
Weber Tilmann
Williams Sam E.
Winter Jaclyn M.
Witte Thomas E.
Xie Huali
Yang Dong
Yu Jingwei
Zaroubi Liana
Zdouc Mitja
Zhong Zheng
Publication date: 18/11/2022

With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/

University of Strathclyde Institutional Repository

ZENODO

Warwick Research Archives Portal Repository

Online Research Database In Technology

Explore Bristol Research

Mapping natural product diversity through genomics

Author: Kautsar Satria A.
Publication venue: 'Wageningen University and Research'
Publication date: 01/01/2021

Background: Natural products (NPs) from plants and microbes are a rich source of bioactive compounds essential for human life. A large part of agriculture, lifestyle and healthcare practice relies on metabolites derived from natural sources. To examine the biosynthetic potential of organisms and to guide NP discovery efforts, researchers increasingly use metabolomic, transcriptomic and genomic approaches. The co-location of metabolic genes in microbial genomes (termed a Biosynthetic Gene Cluster, or BGC) paves the way for inexpensive, high-throughput surveys of natural products. While focused analyses that target a specific family of known compound chemistry have proven successful in optimizing a compound’s utility, a truly global overview that reveals the actual extent of novel chemistry lying unexplored in nature is still hampered by the limitations of the (high-quality) data, techniques and bioinformatic tools currently available. Results: A computational prediction tool, plantiSMASH, was built to enable the exploration of putative plant BGCs; it combines genomic and transcriptomic data to give insights into plant secondary metabolism and evolution. To support large-scale annotation and analysis of BGCs, a reference database of known BGCs (MIBiG) was markedly improved in both quality and quantity, providing a 73% data increase over its initial release version. A large-scale study of BGC and Gene Cluster Family (GCF) diversity across taxa was carried out, enabled by the development of a novel bioinformatics tool that can process 1.2 million BGCs within ten days of computing time. Finally, an online database of more than 25,000 GCFs was released for the first time, giving the community the means to do crowdsourced curation, which in turn feeds back into the annotation and discovery of putative or novel BGCs of their own.
Conclusions: The work presented in this thesis provides the foundation for global, diversity-informed NP discovery efforts. Research aimed at discovering novel products from nature now has a better compass to guide its direction. With the increasing accessibility of long-read sequencing technology, it is now possible to do biosynthetic discovery and functional analysis of microbial BGCs straight from the environment. Combined with the continual improvement of metabolomics, this work will be one half of a truly global, large-scale meta-analysis linking the genomes and metabolites present in the microbial world. Finally, although our work did not shift the established notion that BGCs are an exclusive feature of prokaryotic genomes, we find that there is still some level of genome organization in plants which could be useful in the analysis of biosynthetic pathways, especially when combined with transcriptomics data.

BiG-FAM: the biosynthetic gene cluster families database

Author: Blin Kai
Kautsar Satria A.
Medema Marnix H.
Shaw Simon
Weber Tilmann
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Computational analysis of biosynthetic gene clusters (BGCs) has revolutionized natural product discovery by enabling the rapid investigation of secondary metabolic potential within microbial genome sequences. Grouping homologous BGCs into Gene Cluster Families (GCFs) facilitates mapping their architectural and taxonomic diversity and provides insights into the novelty of putative BGCs, through dereplication with BGCs of known function. While multiple databases exist for exploring BGCs from publicly available data, no public resources exist that focus on GCF relationships. Here, we present BiG-FAM, a database of 29,955 GCFs capturing the global diversity of 1,225,071 BGCs predicted from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs). The database offers rich functionalities, such as multi-criterion GCF searches, direct links to BGC databases such as antiSMASH-DB, and rapid GCF annotation of user-supplied BGCs from antiSMASH results. BiG-FAM can be accessed online at https://bigfam.bioinformatics.nl

Online Research Database In Technology

The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes

Author: Blin Kai
Kautsar Satria A.
Medema Marnix H.
Shaw Simon
Weber Tilmann
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task. Here, we present version 3 of the antiSMASH database, providing a means to access and query precomputed antiSMASH-5.2-detected biosynthetic gene clusters from representative, publicly available, high-quality microbial genomes via an interactive graphical user interface. In version 3, the database contains 147 517 high quality BGC regions from 388 archaeal, 25 236 bacterial and 177 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/

Online Research Database In Technology

BiG-SLiCE: A highly scalable tool maps the diversity of 1.2 million biosynthetic gene clusters

Author: van der Hooft Justin J.J.
Kautsar Satria A.
Medema Marnix H.
de Ridder Dick
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

BACKGROUND: Genome mining for biosynthetic gene clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities. However, current tools are hindered by a bottleneck caused by the expensive network-based approach used to group these BGCs into gene cluster families (GCFs). RESULTS: Here, we introduce BiG-SLiCE, a tool designed to cluster massive numbers of BGCs. By representing them in Euclidean space, BiG-SLiCE can group BGCs into GCFs in a non-pairwise, near-linear fashion. We used BiG-SLiCE to analyze 1,225,071 BGCs collected from 209,206 publicly available microbial genomes and metagenome-assembled genomes within 10 days on a typical 36-core CPU server. We demonstrate the utility of such analyses by reconstructing a global map of secondary metabolic diversity across taxonomy to identify uncharted biosynthetic potential. BiG-SLiCE also provides a "query mode" that can efficiently place newly sequenced BGCs into previously computed GCFs, plus a powerful output visualization engine that facilitates user-friendly data exploration. CONCLUSIONS: BiG-SLiCE opens up new possibilities to accelerate natural product discovery and offers a first step towards constructing a global and searchable interconnected network of BGCs. As more genomes are sequenced from understudied taxa, more information can be mined to highlight their potentially novel chemistry. BiG-SLiCE is available via https://github.com/medema-group/bigslice.
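
Editorial illustration (not part of the record): the abstract above contrasts pairwise, network-based grouping with grouping in Euclidean space. The Python sketch below shows the general idea under loud assumptions: each BGC is taken to be an already-computed numeric feature vector, and a simple one-pass centroid rule with a hypothetical `threshold` stands in for whatever clustering BiG-SLiCE actually uses, which this listing does not describe. The function names `assign_to_families` and `place_query` are invented for the example.

```python
import math


def assign_to_families(vectors, threshold):
    """One-pass grouping: each vector is compared to family centroids only,
    never to every other vector, so no n-by-n distance matrix is built."""
    centroids = []  # running mean vector of each family (hypothetical GCF stand-in)
    counts = []     # number of members per family
    labels = []     # family index assigned to each input vector
    for vec in vectors:
        best, best_dist = None, math.inf
        for idx, centroid in enumerate(centroids):
            dist = math.dist(vec, centroid)  # Euclidean distance
            if dist < best_dist:
                best, best_dist = idx, dist
        if best is not None and best_dist <= threshold:
            # Join the nearest family and update its running mean.
            counts[best] += 1
            n = counts[best]
            centroids[best] = [c + (v - c) / n for c, v in zip(centroids[best], vec)]
            labels.append(best)
        else:
            # Too far from every existing family: start a new one.
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
    return labels, centroids


def place_query(vec, centroids):
    """Loosely analogous to a 'query mode': place a new vector into the
    nearest previously computed family without re-clustering."""
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))


# Toy feature vectors (made up for illustration only).
toy_vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
labels, centroids = assign_to_families(toy_vectors, threshold=1.0)
query_family = place_query((4.9, 5.2), centroids)
```

The cost of this sketch grows with the number of vectors times the number of families rather than with the square of the number of vectors, which is the property the abstract's "non-pairwise" wording points at; the result of a one-pass rule like this depends on input order, a trade-off a production tool would need to handle.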

Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes

Author: Gavriilidou Athina
Kautsar Satria A.
Krug Daniel
Medema Marnix H.
Müller Rolf
Zaburannyi Nestor
Ziemert Nadine
Publication venue: University of Tübingen
Publication date: 01/01/2022

Bacterial specialized metabolites are a proven source of antibiotics and cancer therapeutics, but whether we have sampled all the secondary metabolite chemical diversity of cultivated bacteria is not known. We analysed ~170,000 bacterial genomes and ~47,000 metagenome assembled genomes (MAGs) using a modified BiG-SLiCE and the new clust-o-matic algorithm. We found that only 3% of the natural products potentially encoded in bacterial genomes have been experimentally characterized. We show that the variation of secondary metabolite biosynthetic diversity drops significantly on a genus level, identifying it as an appropriate taxonomic rank for comparison. Equal comparison of genera based on Relative Evolutionary Distance revealed that Streptomyces bacteria encode the largest biosynthetic diversity by far, with Amycolatopsis, Kutzneria and Micromonospora also encoding substantial chemical diversity. Finally, we find that several less-well-studied taxa such as Weeksellaceae (Bacteroidota), Myxococcaceae (Myxococcota), Pleurocapsa and Nostocaceae (Cyanobacteria) have potential to produce highly diverse secondary metabolites that warrant further investigation.